WISE 2014 Challenge: Multi-label Classification of Print Media Articles to Topics

Authors

  • Grigorios Tsoumakas
  • Apostolos N. Papadopoulos
  • Weining Qian
  • Stavros Vologiannidis
  • Alexander D'yakonov
  • Antti Puurula
  • Jesse Read
  • Jan Svec
  • Stanislav Semenov
Abstract

The WISE 2014 challenge was concerned with the task of multi-label classification of articles coming from Greek print media. Raw data comes from the scanning of print media, article segmentation, and optical character segmentation, and therefore is quite noisy. Each article is examined by a human annotator and categorized to one or more of the topics being monitored. Topics range from specific persons, products, and companies that can be easily categorized based on keywords, to more general semantic concepts, such as environment or economy. Building multi-label classifiers for the automated annotation of articles into topics can support the work of human annotators by suggesting a list of all topics by order of relevance, or even automate the annotation process for media and/or categories that are easier to predict. This saves valuable time and allows a media monitoring company to expand the portfolio of media being monitored. This paper summarizes the approaches of the top 4 among the 121 teams that participated in the competition.
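The idea of suggesting topics by order of relevance can be illustrated with a minimal sketch. It assumes a binary relevance setup (one independent model per topic) with a small smoothed naive Bayes scorer; this is an illustrative stand-in, not the method of any participating team. The toy English documents, the topic names, and the function names `train_binary_relevance` and `rank_topics` are all hypothetical.

```python
import math
from collections import Counter

def train_binary_relevance(docs, label_sets):
    """Collect per-topic word counts: one independent model per topic."""
    vocab = {w for d in docs for w in d.split()}
    topics = sorted({t for labels in label_sets for t in labels})
    models = {}
    for topic in topics:
        counts = {True: Counter(), False: Counter()}  # word counts with/without the topic
        n_docs = {True: 0, False: 0}                  # document counts with/without the topic
        for doc, labels in zip(docs, label_sets):
            has_topic = topic in labels
            n_docs[has_topic] += 1
            counts[has_topic].update(doc.split())
        models[topic] = (counts, n_docs)
    return vocab, models

def rank_topics(doc, vocab, models):
    """Return all topics sorted by log-odds of relevance, most relevant first."""
    words = [w for w in doc.split() if w in vocab]  # ignore unseen words
    scores = {}
    for topic, (counts, n_docs) in models.items():
        # Log prior odds with add-one smoothing.
        score = math.log((n_docs[True] + 1) / (n_docs[False] + 1))
        pos_total = sum(counts[True].values()) + len(vocab)
        neg_total = sum(counts[False].values()) + len(vocab)
        for w in words:
            # Add-one smoothed log likelihood ratio for each word.
            score += math.log((counts[True][w] + 1) / pos_total)
            score -= math.log((counts[False][w] + 1) / neg_total)
        scores[topic] = score
    return sorted(scores, key=scores.get, reverse=True)

# Invented toy data: each article may carry one or more topics.
DOCS = [
    "bank raises interest rates",
    "river pollution threatens wildlife",
    "green energy investment boosts economy",
    "central bank inflation report",
]
LABELS = [{"economy"}, {"environment"}, {"economy", "environment"}, {"economy"}]

VOCAB, MODELS = train_binary_relevance(DOCS, LABELS)
print(rank_topics("pollution wildlife river", VOCAB, MODELS))
```

In a setting like the one described, an annotator would see the full ranked list and confirm or reject the top suggestions. A real system would also need to cope with noisy OCR text and a much larger label set, which this sketch does not attempt.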

Similar resources

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...

Multi-label Classification Using Hypergraph Orthonormalized Partial Least Squares

In many real-world applications, human-generated data like images are often associated with several semantic topics simultaneously, called multi-label data, which poses a great challenge for classification in such scenarios. Since the topics are always not independent, it is very useful to respect the correlations among different topics for performing better classification on multi-label data. H...

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...

Content Analysis of Media Coverage of Childhood Obesity Topics in UAE Newspapers and Popular Social Media Platforms, 2014-2017

The 2017 prevalence of obesity among children (age 5–17 years) in the United Arab Emirates (UAE) is 13.68%. Childhood obesity is one of the 10 top health priorities in the UAE. This study examines the quality, frequency, sources, scope and framing of childhood obesity in popular social media and three leading UAE newspapers from 2014 to 2017. During the review period, 152 newspaper articles fro...

SAPKOS: Experimental Czech Multi-label Document Classification and Analysis System

This paper presents an experimental multi-label document classification and analysis system called SAPKOS. The system, which integrates state-of-the-art machine learning and natural language processing approaches, is intended to be used by the Czech News Agency (ČTK). Its main purpose is to save human resources in the task of annotation of newspaper articles with topics. Another important fun...

Publication date: 2014